ABSTRACT


The effects of auditory feedback loss on the speech
production of eight female subjects were investigated under
two types of masking agent. White noise and a noise source 
derived from each individual subject's speech were applied 
binaurally at 85 dB above threshold for the individual 
subject. In addition, each masking agent was applied to 
the larynx by an external electromechanical vibrator 
simultaneously with binaural masking. Speech production 
was elicited by having subjects read aloud a standard 
passage of prose, respond orally to written questions and 
to visual stimuli. Spectrographic records were made of 
speech under the reading aloud condition; these data 
were supplemented by oscilloscopic records, qualitative 
evaluation of the subjects' speech, and introspective 
reports obtained from the subjects. 

The autogenic noise was found to be superior to white 
noise with regard to subject comfort, the apparent degree 
of auditory feedback masking achieved, and the degree of 
speech disturbance observed. 

In addition to replicating effects reported by other 
investigators, spectrographic data showed added shaped
frequency components in the 2.5 kHz to 5.5 kHz range.
These were interpreted as indicating laryngo-pharyngeal 
muscle tension leading to changed voice quality under the 
masking conditions. 


It is hypothesized that the segmental articulations 
are not critically dependent on auditory feedback and
possibly involve either an unregulated feedback or else 
a closed loop, with a relatively low functional load, 
operating at a relatively high brain level. Voice quality 
characteristics due to laryngeal function appear to involve 
a closed loop fed by auditory feedback and proprioception. 
A second hypothesis is advanced, which distinguishes 
between exogenic and endogenic auditory feedback, in 
an attempt to explain the slow onset of speech changes after 
sudden anacusis as compared with the immediate onset of 
speech change in these experiments. 


Implications for further research are discussed. 

CHAPTER ONE


I. INTRODUCTION


This study is concerned with the contribution made by 
auditory feedback (AF) to the mechanisms of motor control in 
speech. The nature of this contribution might be made more 
clear if we could arrange to inhibit the auditory feedback, 
thus permitting us to observe the ensuing effect on speech 
output. The question then arises whether such an observation 
would be indicative of the function of the feedback, or of 
the actual mechanism of control. 

There are two substantive views on the control of 
speech production. First, that various feedback systems 
are crucial and supply moment-to-moment information about 
ongoing movements: a closed loop system. Secondly, that 
feedback is not functionally important, but that control 
depends upon an invariant central patterning allowing 
anticipatory commands to the speech muscles: an open loop 
system. Recently a third view, the possibility of a mixed 
open and closed loop mechanism, has been widely postulated. 
The experiment reported in this study attempts to assess 
these alternative models so far as auditory feedback is concerned.
One significance of such an attempt lies in the
fact that it provides a basis for investigating the role of
feedback in rapidly changing motor patterns within the cranial
neuromuscular system.
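
The distinction between the open loop and closed loop views can be illustrated with a toy simulation. The sketch below is not drawn from the study: the one-dimensional "articulator", the constant disturbance, and the `gain` parameter are all hypothetical, chosen only to show that a preplanned command cannot compensate for an unexpected disturbance, whereas a command corrected from feedback can.

```python
def run(target, disturbance, steps, gain=None):
    """Return the final position of a hypothetical one-dimensional articulator.

    gain=None  : open loop. The preplanned command is issued on every
                 step regardless of where the articulator actually lands.
    gain=k > 0 : closed loop. After every step the command is corrected
                 by k times the observed error (target - position).
    """
    command = target      # central plan: exact if there were no disturbance
    correction = 0.0      # accumulated feedback correction (closed loop only)
    position = 0.0
    for _ in range(steps):
        position = command + correction + disturbance
        if gain is not None:
            correction += gain * (target - position)
    return position


open_loop = run(target=1.0, disturbance=0.2, steps=20)
closed_loop = run(target=1.0, disturbance=0.2, steps=20, gain=0.5)
print(f"open loop error:   {open_loop - 1.0:+.6f}")
print(f"closed loop error: {closed_loop - 1.0:+.6f}")
```

In this sketch the open loop output stays offset by the full disturbance, while the closed loop error shrinks by a factor of (1 - gain) on each step. A mixed system, in these terms, would be one in which the correction is applied only occasionally or only to some parameters.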

Very little has been reported on the results obtained
after effective masking of the speaker's own voice from
himself. There is, in fact, no agreed criterion for
effective auditory inhibition by noise, although it is the
only available method of inhibiting hearing in the normal 
subject. The ideal method would be to use subjects who had 
suffered traumatic bilateral sensori-neural deafness, and to 
test their voice characteristics in controlled experiments 
at onset and over a period of time. Such an experiment would 
be extremely difficult to arrange. Perhaps due to the inherent
difficulties of contriving a laboratory situation to
[illegible] condition of auditory inhibition, much of
the work reported on the masking of auditory feedback fails 
to meet the needs defined for this present study. 

The main objective in this study has therefore been to 
improve the method of masking and to develop a technique which 
might further the attempt to mimic the ideal: a subject with 
intact, normal, language skills but totally inhibited from 
hearing himself speak. 

Physiological and psychological experiments have demon- 
strated the essential importance of sensory feedback in the 
visual, tactile, and proprioceptive modalities in such 
motor activities as walking and eye-hand co-ordination. 
Essentially unsolved problems concern the functional mechanism 
of rapid and finely co-ordinated movements as opposed to 
reflex or relatively slow movements, and whether the control
mechanisms of the cranial neuromuscular system
parallel those of the skeletal neuromuscular system.


Neurophysiological research has addressed such complex 
problems for several decades. Particular progress has been
reported in relation to open and closed loop mechanisms in 
motor control with the gradual realization of the role of 
muscle spindles in the gamma motor system. But although some 
physiologists have long regarded speech as the ultimate 
example of human motor co-ordination (Lashley, 1951; Eccles, 
1966), relatively little work has been done in this area. 
This relative neglect is unfortunate since the speech control 
mechanisms bear directly on the two major problems mentioned 
above: control mechanisms in rapidly alternating motor 
patterns, and possible differences between control in the 
cranial and skeletal neuromuscular systems. 

Apart from purely physiological considerations, this 
study bears upon the investigation of language phenomena in 
the context of modern linguistics. "The concern of modern 
linguistic theory has been to describe language in terms of 
grammatical systems which relate sounds to meanings. This 
leads to a concern with phonological units within the grammat- 
ical system. However, grammatical systems describe properties 
of the sentences produced by speakers, and not properties of 
the speakers themselves. Accordingly, it would be a fundamental 
methodological error to assume that a system which correctly 
fits sentences must thereby correctly describe properties 
of the organism. The physiology of motor control mechanisms 
in speech articulation, on the other hand, allows the 
collection of data that are independent of formal linguistic
considerations and also independent of specific language
systems. Thus, they provide a methodologically sound basis
for evaluating the claims made by formal linguistics with 
regard to language universals as well as the phonologies 
of particular languages. This potentiality of physiological 
research with respect to linguistic theory is not fortuitous. 
Linguistic theory has increasingly directed attention to 
the organism: linguistic competence applies to the speaker 
and not to the sentences he produces. But the investigatory 
methods of formal linguistics do not incorporate either the 
variables or procedures required for these metatheoretic 
considerations. Accordingly, physiological research on 
language phenomena is a necessary condition for the evaluation
of these aspects of contemporary linguistic theory."1

Finally, an understanding of the physiology of speech
production is not only of theoretical relevance, but also 
has significant practical importance. There are many dis- 
orders of language which are primarily disruptions of the 
control mechanisms of speech output: the dysarthrias and 
motor aphasias being prime examples. Increased knowledge of 
these basic mechanisms should have important implications 
for therapeutic techniques used in the treatment of such
disorders.

unpublished paper delivered to the Linguistic Association

II. REVIEW OF THE LITERATURE


Feedback in the production of speech originates from 
hearing and 'feeling' one's own vocal and articulatory speech 
output. The obvious components of the feedback are external 
path hearing (air), bone conducted hearing, internal vibra- 
tion, tactile and proprioceptive sensation. 

The recognition that aural self-monitoring is an im- 
portant adjunct to the act of speaking developed slowly 
until the early 1940's when Georg von Bekesy (1942) publish- 
ed some of his pioneering work. Later, the work of Black 
(1951) and of Lee (1950 & 1951) demonstrated that a delay in 
the external auditory feedback produced an effect, mainly in 
terms of intensity, rate and fluency, on the speaker's perfor- 
mance. The fact that in such experiments the delayed sig- 
nal is the air conducted component alone while the internal 
pathways remain unaltered has interesting implications. 
Stromsta (1962) noted that to produce the experimental effect 
the sound pressure level of the delayed external signal must 
exceed that of the unaltered internal signal. He concluded 
from this that both the external and internal pathways are 
of relative importance to the production of speech. Later 
investigators appear to have given little consideration to 
these factors although they are clearly of essential import 
in any attempt to mask the speaker's own voice. 

It was clinically observed that such a delay, imposed
upon the stutterer, caused relaxation of the characteristic
blocking in his speech and induced a greater degree of
fluency (Van Riper, 1970). Cherry and Sayers (1956) discussed
the effects of masking the various auditory feedback
channels in relation to their studies of inhibition of 
stuttering. They concluded that elimination of air conducted 
feedback has little effect on the performance of a stammering 
subject. Blocking air and bone conduction pathways simulta- 
neously, however, generally resulted in complete suppression 
of stammering. 

For the most part, there has been a tendency to consider
the effect of ambient noise on hearing acuity and this
in itself has become a major field of enquiry (Miller, 1947). 
Measurements were made of levels of discrimination for one 
tone in the presence of another and of such phenomena as 
recognition of speech or patterns in the presence of various
kinds of acoustic masking noise.


A considerable amount of work has been done connected 
with the observed increase of voice intensity in the presence 
of noise. Experimenters in this field span the years from 
Lombard (1911) to Lane (1971). Generally, they interpreted 
this effect as the attempt by the speaker to over-compensate 
for the ambient noise. In a recent review, Lane and Tranel
(1971) interpret the increase in intensity in terms of the
speaker's attempt to maintain intelligibility of verbal
communication in the face of noise. However, Cherry and
Sayers (1956) have noted that some speakers spontaneously
maintain normal or sub-normal voice levels during masking.
This observation, plus the complexity of the phenomenon, raises
serious doubts about Lane and Tranel's rather simple
hypothesis - "...the speaker need no more listen to himself 
while speaking than he need speak to himself while listening." 
More significant, perhaps, is the fact that this hypothesis
is inconsistent with the observed fact that people who cannot
hear their own speech (the deaf) are unable to maintain
normal speech production for very long. 

The variety of approaches in masking studies has per- 
haps clouded an important consideration. It is the question 
of whether or not aural self-monitoring of speech (through 
external or internal pathways) is merely an epiphenomenal
adjunct to the speech act, or is in fact an essential com- 
ponent of sensorimotor control of speech. Viewing the 
Lombard sign (i.e. gradual increase in vocal intensity during 
a similar increase in masking intensity) (Lombard, 1911; 
Miller, 1947) as a non-specific phenomenon may explain why 
several authors have dismissed the role of auditory feedback 
as relatively unimportant compared to tactile and proprio- 
ceptive feedback (Lane & Tranel, 1971; Gammon et al, 1971). 

A possibility which has not been discussed in the
literature is that the observed effects upon motor output
may, in fact, be the result of the noise input, the masking 
noise itself creating a disturbance of system efficiency. 
In this respect, we should note that in experiments where two 
or more sensory channels are simultaneously inhibited (Ringel 
& Steer, 1963; Gammon, et al, 1971; Silverman & Goodban, 1972) 
the contribution of individual components may have been
confounded.

In view of this, it is necessary to look at the
available literature in two separate ways. First, to
review the type of auditory masking which has been used, and
its efficiency in inhibiting self-monitoring of speech.
Secondly, to discuss the effects reported and how the
measurements of these effects have been interpreted.


Part I


Masking experiments have mainly been conducted as 
part of a larger plan, e.g. with topical anaesthesia and 
nerve block treatments as added conditions. Masking has also 
been used as a method for investigation of the characteristics 
of stuttering (Cherry & Sayers, 1956; Burke, 1969; MacCulloch
et al, 1970; Silverman & Goodban, 1972). In these studies the
masking noise itself has been assigned only a minor role
in the experimental design. The actual acoustic details of
the masking noise (its amplitude, spectrum, and effectiveness
in covering the vocal characteristics of the subject)
have often not been a major focus of concern. There are
obvious limitations to the interpretation of data obtained
from experiments whose treatment plans do not take account of 
variations in these parameters. 

Where the purpose of the experiments has been to 
block the speaker's awareness of his own voice, almost all 
experimenters have used white noise masking. Miller compared
different types of masking and their effectiveness, but
principally for masking independent noise signals (Miller,
1947). He discussed pure tone, narrow band, wide band,
white noise, filtered noise, voices, and music as masking
agents. His conclusions, however, should also be relevant
to the problem of feedback masking. He stressed that the
matching of the spectrum of the masking sound to the spectrum 
of the signal was of utmost importance. The experiments he 
reviewed mainly involved the perception of an external signal 
under conditions of monaural or binaural masking, testing 
response to both "background noise" and to interference. In 
his report of experiments using music as a masking agent, 
Miller made an observation which does not appear to have been 
expanded upon or further investigated within the context of 
masking experiments. He stated: "A very complex masking 
sound is obtained, however, if two or three phonographic 
recordings are played at the same time. The different 
orchestras fill in each others pauses, and the coverage of 
the speech frequencies is more consistently adequate."
(Miller, 1947).


Subsequent to Miller's studies, less emphasis has
been placed upon various methods of masking the acoustic
signal. In the investigation of motor control mechanisms in
speech we typically find that more than one feedback system
has been inhibited during the course of one experiment.
Auditory inhibition, therefore, became one of several
treatment conditions, and the acoustic details of the masking
sound, and the accompanying complex effect upon the subject,
have received little attention. This, in turn, has led to
diminished consideration of the ethical issues regarding
intensity levels of masking.


Illustrating this point is the work of Black (1951) 
who subjected 144 college students to two hours of noise at 
110 dB (SPL). He then measured the temporary hearing loss 
induced, and the increase in the subjects' voice levels 
during reading of sentences. Hanley and Steer (1949), cited 
by Lane and Tranel (1971, p.684) also used excessively high 
intensities: 5 dB increments from 100 dB to 130 dB, while 
measuring voice level increases in noise compensation 
experiments. Gammon et al (1971) used 110 dB white noise 
during a multi-sensory blocking experiment. During this time, 
interesting physiological experiments have exposed cats, for 
example, to lengthy periods of intense noise, later revealing 
gross inner ear changes, which would certainly not be described
as 'temporary' (Bredberg et al, 1970). Standards
regarding exposure to noise have been legislated in many
countries.


The authors who used high intensity levels were no 
doubt attempting to overcome the known difficulty that a 
speaker is well able to hear himself speak until the masking 
sound exceeds the perception of the internally conducted speech 
wave form. It is important, nevertheless, to be aware, as 
was Miller (1947), that neither pure tones nor white noise, 
however loud, cover the entire spectrum of the internal speech
wave form.

Cherry and Sayers (1956) endeavoured to mask both air
and bone conducted feedback with a pure tone, and found that 
the loudness of the masking tone had to be "close to the 
threshold of pain" (p. 239) to achieve complete masking. 
They were aware of the uncertainty that their subjects were 
indeed completely masked. In fact, their results caused them 
to continue with a further experiment using white noise in- 
stead of the pure tone, "with its more highly concentrated 
energy". The white noise also approached pain level and the 
results revealed the interesting observation that, for 
stutterers at least, it was essential to mask the very low 
frequency tones of speech, or stuttering would remain. The 
authors proceeded to use filtered noise to try to achieve 
this aim. 

Ringel and Steer (1963) chose their subjects for 
auditory, tactile and proprioceptive masking experiments 
from female students majoring in speech pathology and audiology.
They used binaural white noise at 94 dB (re. 0.0002
dyne/cm²). Apparently they did not consider the possibility
that their subjects might still be aware of their own voice 
with the level and type of masking used. They did not de- 
scribe any method of ascertaining the speakers' auditory 
awareness, and the minimal effects upon speech output which 
they obtained suggest the possibility that their subjects 
were not effectively masked. 

Gammon et al (1971) used eight subjects for an
experiment involving tactile, proprioceptive and auditory
inhibition. They used 110 dB SPL (re. 0.0002 µbar) white
noise for binaural auditory masking. Again the auditory 
masking agent chosen, white noise, is the easiest to produce, 
but it is not easy to analyze results obtained due to the 
lack of assurance that the subjects actually experienced 
total inhibition of auditory feedback. The very high in- 
tensity used is perhaps an indication that the authors were 
aware that the white noise might not adequately mask their 
subjects, but no reference is made to this problem. 
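
To give a sense of scale for the levels quoted in these studies, the following sketch converts a sound pressure level into an RMS pressure and compares two levels. It relies only on the standard definition of dB SPL; the reference pressure of 0.0002 dyne/cm² (equivalently 0.0002 µbar, or 20 micropascals) is the one cited above. The function names are illustrative and are not taken from any of the papers reviewed.

```python
# Reference pressure for dB SPL:
# 0.0002 dyne/cm^2 = 0.0002 microbar = 20 micropascals.
P_REF_PA = 20e-6


def spl_to_pressure_pa(level_db):
    """RMS sound pressure in pascals for a level given in dB SPL."""
    return P_REF_PA * 10 ** (level_db / 20)


def pressure_ratio(level_a_db, level_b_db):
    """How many times greater the sound pressure is at level A than at level B."""
    return 10 ** ((level_a_db - level_b_db) / 20)


def intensity_ratio(level_a_db, level_b_db):
    """How many times greater the sound intensity (power) is at level A than at level B."""
    return 10 ** ((level_a_db - level_b_db) / 10)


for level in (94, 110):
    print(f"{level} dB SPL = {spl_to_pressure_pa(level):.3f} Pa")
print(f"110 dB vs 94 dB: {pressure_ratio(110, 94):.2f} x pressure, "
      f"{intensity_ratio(110, 94):.1f} x intensity")
```

On this definition, the 16 dB difference between the 94 dB and 110 dB conditions corresponds to roughly a sixfold difference in sound pressure and about a fortyfold difference in intensity.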

Other experimenters have reported unspecific auditory 
masking techniques. Ladefoged (1967) describes an "informal" 
experiment in which "the intensity of the (masking) noise 
was adjusted so that the subject could not hear his own 
voice even by bone conduction." A 300 Hz. pure tone was used 
by MacCulloch et al (1970) at an intensity "sufficient to 
render the subjects' own speech inaudible." The subjects 
were children and they were instructed to set the power con- 
trol themselves by indicating when the next set of discrete 
sound pips would be "intolerably loud". Silverman and 
Goodban (1972) used white noise of 94 dB SPL to mask twenty 
undergraduate students, twelve males and eight females. 
Seemingly, no control for the differences in the vocal
quality and character of males and females was made.
The lower fundamental frequency of the normal male voice
could readily have resulted in substantial sound energy
being present below the frequency of the masking sound. It
seems unlikely, therefore, that the same level of masking
effectiveness was achieved with the male subjects as with
the females. Sussman and Smith (1971), in experiments in- 
volving delayed auditory feedback, set the playback 
intensity of the delayed signal "high enough to mask the 
actual voice of the speaker and any possible bone conducted 
feedback." No evidence was given either as to the masking 
effectiveness or the intensity employed. 

Some inconsistencies and important ambiguities can be 
discerned in all these experiments. Whether or not the 
auditory masking agent was as effective as the analysis of
the results assumes is a major question. The fact that the
intensities used have so often exceeded reasonable standards 
of ethical procedure is of great importance and relevance 
in the planning of future experiments. Using human beings 
in experimental research imposes limits within which the 
physiological conditions of the experiment must remain.
There is an abundance of literature reporting on noise exposure
and associated hearing loss, especially with reference
to industrial safety measures. It is clearly incumbent upon
experimenters to remain within established standards.


Part II 


The previously quoted literature falls into two distinct
groups.

(i) In the first group investigators have used auditory 
masking as one part of a multivariate experiment which also
involved inhibition of tactile and proprioceptive feedback.
The experimenters have collected a large amount of data
which demonstrated disturbance of voice and articulatory 
characteristics. They correlated such variables as pitch 
change, peak intensities, misarticulations, voice quality 
and rhythm, mean syllable duration, and vowel length. The 
general conclusion in the majority of these studies was
that auditory feedback disturbance causes some disruption
to a variety of speech output characteristics in normal
speakers. There is also general agreement that when two 
feedback systems are disturbed at the same time, the dis- 
ruption to articulatory precision and speech quality 
increases. However, we find papers which report auditory 
masking as being effective but which do not specify the 
intensity levels and spectral characteristics of the masking 
sound. Others report intensity levels but not spectral 
characteristics or apparent masking effectiveness. Spectral 
characteristics have particular significance, for example 
with regard to the sex of the speaker, under auditory feed- 
back masking, but little or no attention has been paid to 
this factor in this first group of experiments. 

(ii) In the second group, investigators have generally 
used auditory inhibition as the main experimental condition. 
This has led to a more concise description of the actual 
techniques of masking. 

Cherry and Sayers (1956) reported experiments on inhibiting
stutter by different techniques. They found it
necessary to mask bone-conducted auditory feedback in order
to reduce the low frequency components of speech which were
known to be monitored through bone conduction pathways
(von Bekesy & Rosenblith, 1951). They concluded that certain
stimuli were "undoubtedly the larynx tones of the
speaker". They experimented with pure tone, white noise, 
and filtered noise as masking signals. Unfortunately, they 
did not describe their instrumental design in detail nor did 
they give precise intensity levels, reporting only that the 
masking sound approached pain level. 

Miller (1947) made several important observations con- 
cerning both the techniques of masking and the effectiveness 
of different acoustic masking signals. He was aware that the 
masking noise must be uninterrupted, and stated: "any 
periodic interruption of a masking sound lowers its masking
effectiveness." It should also have a concentration of 
power in the low frequency range if it was to "seriously mask" 
human speech. It followed from his study of various masking 
experiments that the "optimal masking noise has a spectrum 
similar to the long-interval speech spectrum which is being
masked." 
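
Miller's points about uninterrupted masking, and his earlier remark that several recordings played together "fill in each others pauses", can be illustrated with a small sketch. It is not a description of how any masking noise in the literature, or in this study, was actually produced: the on/off activity pattern, the frame counts, and the time shifts are all hypothetical, and the copies are shifted circularly for simplicity.

```python
def layer(envelope, shifts):
    """Activity pattern of several time-shifted copies played together.

    `envelope` is a list of booleans, one per frame (True = sound present).
    Each copy is shifted circularly by the given number of frames; a frame
    of the mixture is active if any copy is active in that frame.
    """
    n = len(envelope)
    return [any(envelope[(i - s) % n] for s in shifts) for i in range(n)]


def gap_fraction(envelope):
    """Fraction of frames in which no sound is present."""
    return envelope.count(False) / len(envelope)


# Hypothetical source: 6 frames of sound followed by 4 frames of pause, repeated.
single = ([True] * 6 + [False] * 4) * 10

print(gap_fraction(single))                        # one recording alone
print(gap_fraction(layer(single, [0, 3])))         # two overlapping copies
print(gap_fraction(layer(single, [0, 3, 7])))      # three overlapping copies
```

With this particular pattern a single source is silent for 40 per cent of the frames, two overlapping copies leave 10 per cent uncovered, and three copies leave no gaps. The sketch addresses only temporal continuity; it says nothing about the spectral match that Miller also required.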

The more recent work of Stromsta (1962; 1972) implies
that speech variations induced by auditory masking are 
possibly peripheral reflections of a "control or synchronizing 
system". The hearing of one's own voice would be a crucial 
part of such a system. Stromsta uses the term "sidetone" to 
refer to auditory feedback signals received by the speaker
via essentially laryngeomandibular pathways. His numerous
measurements of phase and conduction velocities indicate
important relations between air and bone conduction pathways. 
He suggests that the control, or synchronizing, system may 
involve "(i) a neuromuscular initiation and effector portion, 
and (ii) a sensory feedback portion." The sidetone would 
evidently serve as a factor in the sensory feedback portion. 
Stromsta argues that his research into the nature and function 
of these sidetone pathways has shown that the laryngeal tones 
are an important component of auditory feedback (Stromsta, 
1972). If so, masking of the laryngeal tones may be required 
for effective inhibition of auditory feedback. 

Sussman and Smith (1971) studied the effect of delayed 
auditory feedback upon the dynamics of jaw movement in speech 
production. They concluded that "the speech motor system 
is intricately synchronized with the temporal patterning of 
the auditory feedback of speech". 

In summary, prior to about 1970 experimental results 
tend to be interpreted as indicating that auditory feedback 
does not play a significant role in contributing to motor 
control mechanisms in speech production. Since 1970, we find 
a tendency in the opposite direction; however, neither the
nature of the disturbance to speech caused by auditory masking
nor the requirements for effective masking have been worked
out.

There is fairly general agreement that auditory masking
does cause some disturbance to speech output. The increase
in intensity of the voice has been often noted, though
little significance has been attached to this phenomenon.
Other disturbances have been reported by individual authors
but these have not been well characterized. These have in- 
cluded changes in fundamental frequency, pitch, and speech 
rate. It is known that the auditory-masked speaker tends to 
alter vowel length, to slur, and to omit or substitute 
syllables and consonants. Implicit in the literature, but
not discussed in depth, is that the disturbance to speech 
is highly variable from one utterance to another and, per- 
haps, from one speaker to another. The importance of the 
sidetone pathways to the acoustic feedback mechanisms has 
been argued though not closely followed up. The sidetone 
pathways clearly include laryngeal contributions to internally 
conducted hearing, but possible laryngeal contributions to 
proprioceptive feedback should not be overlooked. 

With regard to the masking signal, it is acknowledged 
that in the case of delayed auditory feedback the intensity 
of the delayed external signal must exceed that of the un- 
altered bone-conducted signal. White noise has been found 
to be more effective than pure tones in masking speech, and 
some authors have emphasized the importance of masking the 
low frequency components of the subject's voice. But, in 
spite of the considerable number of reported studies in this 
field, there is no agreed criterion or method for effective 
auditory masking of the speaker's own voice. No agreement has 
been reached regarding the intensity or frequency spectrum of 
noise to be used. The role of the pathways conducting auditory
stimuli internally from the larynx, for example, has not been
controlled or accounted for in any system of masking the
speaker's own auditory feedback.

Thus, quite fundamental questions concerning the role 
of auditory feedback in the production of speech remain open. 
While information does exist in the literature to suggest 
that it may be important, the experimental data do not provide 
an adequate basis for detailed conclusions. Important parameters
for consideration are the feedback integrator, which
receives feedback as input, and the final motor output; the
control system ultimately reduces to these parameters. But the
full significance of these parameters is likely to remain
obscure until we have improved methods of auditory feedback
inhibition.

CHAPTER TWO


EXPERIMENTAL OBJECTIVES 


As indicated in the Introduction, speech control 
mechanisms provide a significant instance for two problems 
in neuromuscular physiology: 
(1) whether control in the case of skilled, rapidly
alternating, and finely adjusted movement depends on
open or closed loop systems, or a mixture of both;
(2) whether the motor control mechanisms of the cranial
neuromuscular system parallel those found in the skeletal
muscles.


In addition to these problems of current physiological 
interest, there is the further significance for physiology, 
as well as for purely linguistic inquiry, arising from the 
motor activity of speech having to be in accordance with a 
predetermined plan that involves the coding, at different 
neurological levels, of information extending from cognitive 
intentions to the final motor execution. Speech production
is thus a complex, hierarchically organized, phenomenon 
involving a number of neurological levels. 

The fact that auditory feedback (AF) is present
during normal speech production does not admit any simple, 
prima facie, interpretation. An AF loop could be closed at 
one or more neurological levels, or it could be that AF is 
an incidental epiphenomenon with respect to the entire range 
of functions that underlie speech production. 

The presence of AF certainly prevents us from treating
the speech production mechanisms as a straightforward 
phenomenon with respect to problems of modern physiological 
interest such as (1) and (2) above. Because of this com- 
plication relative to control mechanisms in the skeletal 
muscles, it is necessary to assess the contribution of AF 
before we can investigate the speech control mechanisms in 
terms of problems like (1) and (2). 

In principle, we should expect that analysis of effects 
obtained under AF inhibition should throw light on the 
neurological levels that normally depend on AF. The current- 
ly available evidence, however, makes it difficult to under- 
take such analysis. Published reports indicate only slight 
changes in speech during presumed AF inhibition, speech 
output remaining highly intelligible. On the face of it, 
these reports would preclude any major contribution of AF 
to critical control parameters at any level of neurological 
function, thus supporting the view that AF constitutes an 
epiphenomenon of little or no functional significance. How- 
ever, the characteristic dependence on white noise as a 
masking agent makes interpretation of the data a difficult 
task for two reasons:
(a) The slight effects on speech output may be due to failure
to achieve AF inhibition in experiments where the white
noise used was a weak masking agent relative to the
spectral characteristics of the AF.
(b) In other experiments, where total AF inhibition is
claimed, white noise intensity may have been at a
sufficiently high level to introduce a confounding
of AF inhibition with a general, non-specific, stress
phenomenon. Even a completely open loop system might
exhibit oscillation under a generally disturbing
sensory input.


In view of the above considerations, the primary 
objective in the present study was to develop a well 
defined technique which is effective in producing AF 
inhibition without introducing a strongly noxious stimulus. 

In order to meet this joint objective, the same 
intensity level was used for white noise and autogenic 
masking. Thus, questions related to non-specific stress 
and to speech disturbances observed during masking treat- 
ments are referred to a single sound intensity level. 

If, at the same intensity, autogenic masking is at 
least as effective as white noise masking in producing 
AF inhibition, while at the same time autogenic masking 
is less stressful to subjects than white noise masking, 
then the primary objective may be said to have been 
achieved with the consequence that speech disturbances 
observed under autogenic masking can be analyzed with 
improved confidence as to their relevance to phenomena 
of sensory inhibition. 


The data recorded in the present study permit 
analyses of the kind made by earlier investigators: funda-
mental frequency distribution, speech rate, phonation
time ratios, etc.; voice intensity distribution is precluded
because of our attempt, described in Chapter Three, to 
prevent overloading the recording apparatus. 

A large range of articulatory variables has been 
implicated, with varying degrees of consistency, in 
auditory masking experiments. However, it has not 
generally been possible to predict on theoretical grounds 
which variables should be affected by auditory feedback 
masking. Further, consistent analysis and interpretation 
of speech changes observed under auditory masking is 
made difficult by differences in experimental conditions, 
as well as the focus of attention, between different 
investigators. In general, the selection and quantifica- 
tion of speech parameters in relation to problems of 
sensory and motor physiology is a topic that needs
major review.

For the immediate purposes of this thesis, attention 
was restricted to certain articulatory changes noted 
during the pilot experiments and to the apparent laryngo-
pharyngeal tension associated with changed voice quality.
This does not appear to have been reported in previous 
studies of auditory masking, though during the preparation
of this thesis Ringel et al. (1973) reported added
frequency components in the speech of subjects under topical
anaesthesia of the oral mucosa.


CHAPTER THREE 


Experimental Methods 


1. Design


Three types of speech output were obtained from four
adult and four teenage female subjects under a con-
trol condition and four conditions designed to mask auditory
feedback.


(i) Speech elicitation procedure 

The three types of speech output were chosen to re- 
present a range from relatively automatic to propositional 
speech. On the basis of widely held clinical opinion, com-
pletely automatic speech might show more resilience to
sensory inhibition than would propositional speech. How- 
ever, there are difficulties in the experimental use of 
completely automatic speech due to intrinsic habit patterns 
associated with such speech articulation. For this reason, 
subjects were asked to read aloud a standard ten-sentence 
passage of prose to represent a speech form that is relative- 
ly automatic compared with fully propositional speech. 
To represent propositional speech, the subjects were asked 
to respond orally to questions presented in written form, 
and also to describe scenes presented in a set of pictures. 
This material is contained in Appendix A. 

The speech elicitation procedures were introduced 
in a fixed order: 
1. Seven written questions requiring a one-sentence oral
response.

2. A ten-sentence prose passage to be read aloud. 

3. Visual presentation depicted in four different pictures, 
the scenes to be described orally. 

The question answering, as one form of propositional
speech, was presented first to absorb any early tension in
the subjects and to permit initial adjustment to the mask-
ing noise. The relatively automatic speech obtained from
reading aloud was expected to be least disturbed by masking;
but since this form of speech would provide the only basis
for true replication across subjects, it was introduced
after initial adjustment as the primary data source. The
second form of propositional speech, oral description of
scenes, was elicited last.


(ii) Masking procedures 

The above procedures were followed under two conditions 
of binaural auditory stimulation intended to deprive the 
subjects of AF during their speech production. The first 
masking noise consisted of white noise at 85 dB above the 
individual subject's hearing threshold. The second, auto- 
genic masking, applied at the same intensity level, was ob- 
tained from a reverse playback of four superimposed passages 
read aloud by each subject thus obtaining a masking agent 
sensitive to the spectral characteristics of the individual 
subject's voice. Two additional conditions were introduced
by repeating the above conditions while simultaneously
applying the masking signal to the larynx via an external
vibrator. In order to restrict the length of the experimen-
tal sessions, only the oral reading was elicited from the
subjects for the two conditions involving the laryngeal
vibrator. As a control condition, the speech elicitation
procedures were followed but without the application of a
masking signal.


Thus the following conditions were involved: 


I. (A) Binaural application of white noise signal. 
(B) Condition I(A), plus white noise applied to the 
larynx. 
II. (A) Binaural application of autogenic noise signal. 
(B) Condition II(A), plus autogenic noise signal 
applied to the larynx. 


III. Control condition.


The non-masking or control condition was applied last 
so that any fatigue effects would be less likely to occur 
under the masking conditions. A random draw was made for 
each subject individually to determine the order in which 
the masking conditions, white or autogenic noise, were to 
be administered. Whichever type of noise was selected, 
binaural administration was followed by binaural plus 
laryngeal application of the same masking source; the same 
order was then maintained, binaural followed by binaural 
plus laryngeal, for the second masking noise. 
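
The ordering rule just described can be sketched in a few lines of Python. This is only an illustration: the function name, the condition labels, and the use of a seeded pseudo-random generator are assumptions introduced here, and the thesis does not specify how the random draw was actually carried out.

```python
import random


def session_order(rng):
    """Return the order of conditions for one subject.

    The two masking noises are placed in random order; within each
    noise, binaural masking precedes binaural plus laryngeal masking;
    the control condition always comes last.
    """
    noises = ["white", "autogenic"]
    rng.shuffle(noises)  # the random draw made for each subject
    order = []
    for noise in noises:
        order.append((noise, "binaural"))
        order.append((noise, "binaural + laryngeal"))
    order.append(("none", "control"))
    return order


if __name__ == "__main__":
    rng = random.Random(0)  # hypothetical seed, for repeatability only
    for subject in ["S1", "S2", "S3"]:  # hypothetical subject labels
        print(subject, session_order(rng))
```

Whatever the outcome of the draw, every subject receives five conditions, with the control last and each noise type presented in its binaural form before its binaural plus laryngeal form.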

The experiments reported here were frankly exploratory
in character; within the framework of the above design
features, we hoped to collect data relevant to the following
questions.


1. Is there evidence of effective AF inhibition at a comfor- 
table level (85 dB above threshold) of masking noise? 

2. If AF inhibition is achieved, is there evidence of 
associated change in speech production? 

3. Are there differences in the above effects that can be
associated with differences in the masking conditions
employed?


The data utilized in the attempt to answer these 
questions were drawn in part from the speech output of the 
subjects under the conditions summarized above; details of 
these data and their analysis are presented in Chapter Four. 
In addition, introspective reports were obtained from the 
subjects following completion of the experimental sessions. 
These reports concerned qualitative reactions to the experi- 
mental procedures and were intended to provide data relevant 
to the effectiveness of the masking conditions in inhibiting 
AF and to the evaluation of non-specific stress factors 
that might influence the interpretation of the speech output 
data. Further details of these reports are provided later
in this chapter, in section 3(iii), and in Chapter Four.


2. Subjects


A total of ten female volunteer subjects took part 
in the experiments. Six were adults who ranged in age from 
19 to 34 years. Four subjects were high school students
aged 16 years.

All subjects were native speakers of North American
English. Each had speech which by standard clinical
criteria fell within the normal range, and each had normal
hearing acuity ascertained by the testing procedures de- 
scribed below. 

Two of the adult subjects were used in preliminary
experiments; their data were excluded from the final com- 
parisons because of differences in conditions administered 
to them as compared with the fixed conditions applied to 
the remaining subjects. Thus, the subjects fall into two
groups of four, distinguished by adult versus
teenage status.


3. Apparatus and Procedures


(i) Experimental arrangement 

The laboratory was furnished with a dental chair 
fitted with a head rest which maintained the subject in a
relatively fixed position for the experiment. The chair 
was so positioned that the experimenter could work behind 
the subject and so that the subject was unable to see the 
instrumentation. A small table in front of the subject 
held the instructions and speech elicitation material 
supported on an easel. Beside this was a box with a small 
red neon lamp. Circuitry in the box triggered the light
when the intensity of the subject's voice caused the V-U
meter of the data tape recorder to exceed its reference
level. The recording level was set during normal speech
before the commencement of the experiment. Gross increases
in intensity would seriously overload the tape recorder and
cause unsatisfactory data recordings. The device was used 
to counteract this problem. An adjustable stand mounted on 
the table held a Sennheiser, Model 421 N, microphone. This 
was positioned approximately twelve inches from the subject's 
mouth. For the experiment in which the laryngeal vibrator 
was used, the table served as a stand for a magnetic clamp 
to which the vibrator was attached. 

The masking noise was presented through Telephonics, 
TDH 39 earphones, installed in an Auraldomes Aural Research 
headset. The earphones were driven by a Braun, model AG 
type CSV 250, power amplifier the output of which was moni- 
tored with a Hewlett-Packard, model 400 E, A.C. volt meter. 
An attenuator was incorporated in the line between the 
amplifier and the earphones. A 2-position switch selected 
zero or 63 dB attenuation. The attenuation position was 
used to reduce the sound intensity for the testing of the 
subject's hearing threshold without interference from 
background noise originating in the amplifier. The masking 
signals were obtained from a Braun, model TG 504, tape 
recorder, using appropriate tapes. In those experiments
where the laryngeal vibrator was used, a shaped plastic pad 
was held in contact with the subject's neck in the region of 
the thyroid cartilage. A thin stainless steel rod (5 1/2"
in length and 3/16" in diameter) connected the pad to a
Bruel and Kjaer, model 4810, Mini Shaker, having a frequency
range 20 Hz to 18 k Hz. The shaker was driven by the same
signal as the aural masking through a second Braun, model
AG type CSV 250, power amplifier, having a frequency response
of 40 Hz. to 15 k Hz ± 1 dB. The speech output was record-
ed on a Teac, model A-7030 tape recorder, at a transport
speed of 7 1/2 inches per second. 

(ii) Stimuli

Two different tapes were used for the auditory masking 
signal. The white noise tape was recorded using the signal
from a model 1382, General Radio Random Noise Generator. 
An autogenic noise tape was made for each individual subject. 
The subject read four different passages of prose, each of 
varying style and of about five minutes duration. These 
were all recorded on the same audio tape track, one superim-
posed on the other. The recordings were made in a sound-
proof booth, using a Teac, model A-7030 tape recorder,
transport speed 7 1/2 inches per second, and a Sennheiser, 
model 421 N microphone. Later, when used for masking, the 
autogenic noise tape recording was played backwards. 
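
A digital analogue of this tape procedure can be sketched as follows. The sketch assumes that each passage is available as a list of sample values at a common sampling rate; the function name, the truncation to the shortest passage, and the division by the number of passages (to keep the mixture within range) are illustrative assumptions and are not part of the original analogue method.

```python
def autogenic_mask(passages):
    """Superimpose several recorded passages and reverse the mixture.

    passages: list of equal-rate sample sequences (one per passage).
    Returns the time-reversed, sample-by-sample average of the passages.
    """
    if not passages:
        raise ValueError("at least one passage is required")
    # Assumption: mix only over the span covered by every passage.
    length = min(len(p) for p in passages)
    count = len(passages)
    # Superimposition: add the passages sample by sample (scaled by 1/count).
    mixed = [sum(p[i] for p in passages) / count for i in range(length)]
    # Backward playback: reverse the order of the samples.
    return mixed[::-1]


if __name__ == "__main__":
    # Two tiny hypothetical "passages" stand in for the four recordings.
    print(autogenic_mask([[0, 2, 4], [2, 2, 2]]))
```

Reversing a signal in time leaves its long-term magnitude spectrum unchanged, so the reversed mixture keeps the spectral characteristics of the speaker's voice while no longer being intelligible as running speech.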

In addition, a tape was prepared of a 1 k Hz. pure tone
used for determination of the subject's hearing threshold.


(iii) Experimental procedure 

An experimental session was held separately for each 
subject, lasting approximately two and a half hours. The 
first task was the recording of the autogenic noise tape, 
as described above. The subject was then seated in the 
dental chair and the experimenter explained the procedure,
allowing the subject to become accustomed to the chair and
to the general laboratory environment. Care was taken to
explain the purpose of the experiment as simply as possible. 
Prior to the experimental session the subject had been told 
that we were interested in recording people speaking under 
different types of noise conditions. In the laboratory this 
statement was repeated. No suggestion that speech might be 
altered by such a procedure was made. The particular steps 
in the procedure were explained carefully. Thus the risk 
of response bias resulting from the subject's awareness of 
what they were doing was assumed to be minimized. 

A measurement of the subject's hearing threshold was 
then made in the following manner. The subject was asked 
to listen to discrete pips at 1k Hz. played loudly through 
the Auraldomes earphones, but with the headset still on the 
table. This was simply to familiarize the subject with the 
type of sound to expect and to facilitate the explanation 
of the test. It was then explained that the subject would, 
when the headphones were set on the head, hear three faint 
pips in one ear. When the subject was able to barely hear 
the tones, but recognize the three pips, she was asked to 
indicate that she had heard them. The earphones were then 
set comfortably on the subject's ears, and using the attenua- 
tor and the volume control on the Braun amplifier, each ear 
was tested consecutively and the output level at threshold 
measured on the A.C. volt meter. 

At this point in the experiment, the subject was given 
a break for coffee, and was encouraged to freely discuss any
anxiety that she might feel about the surroundings or the
procedure.

Returning to the laboratory, the subject was again 
seated in the dental chair. The experimenter then determined 
the sound level for masking by first averaging the measure- 
ments taken for each ear. This average was taken as the 
subject's threshold and the masking level adjusted with the 
amplifier volume control so that the A.C. volt meter read 
85 dB above this value. This level was used for both types 
of masking and continuously monitored on the volt meter. 
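
The arithmetic behind this setting can be illustrated with a short Python sketch. Several points are assumptions made for the example and are not stated in the text: that the threshold readings were taken at the amplifier output with the 63 dB attenuator switched in, that masking was presented with the attenuator switched out, that the two ears were averaged as decibel levels, and that the meter readings are expressed in volts. All names are hypothetical.

```python
import math


def voltage_to_db(volts, reference=1.0):
    """Express a voltage as a level in dB relative to `reference` volts."""
    return 20 * math.log10(volts / reference)


def db_to_voltage(level_db, reference=1.0):
    """Convert a level in dB (re `reference` volts) back to a voltage."""
    return reference * 10 ** (level_db / 20)


def masking_meter_reading(left_thr_volts, right_thr_volts,
                          level_above_thr_db=85.0,
                          thr_attenuation_db=63.0,
                          mask_attenuation_db=0.0):
    """Meter reading (volts) needed for masking at a level above threshold.

    The threshold readings are taken with `thr_attenuation_db` in the
    line, so the level reaching the earphones is lower by that amount.
    """
    left_db = voltage_to_db(left_thr_volts) - thr_attenuation_db
    right_db = voltage_to_db(right_thr_volts) - thr_attenuation_db
    threshold_db = (left_db + right_db) / 2  # assumption: mean of dB levels
    target_db = threshold_db + level_above_thr_db + mask_attenuation_db
    return db_to_voltage(target_db)


if __name__ == "__main__":
    # Hypothetical threshold readings of 10 mV for each ear.
    print(round(masking_meter_reading(0.010, 0.010), 4))
```

Under these assumptions the required meter reading lies 85 - 63 = 22 dB above the threshold reading, that is, about 12.6 times the threshold voltage.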

The type of masking noise, either white noise or auto- 
genic noise, chosen for the first condition was selected 
randomly. The control condition was always the last. To 
facilitate analysis of the data, the data tapes were blocked 
out to permit the same ordering of conditions on each sub- 
ject's speech output tape. The subject was given careful 
oral instructions concerning the red light beside the easel. 
She was told that the red light would flash if she was 
speaking so loudly that it would be difficult to make a good 
recording of her speech, and that it would simply remind 
her to lower her voice. No further emphasis was placed on 
the intensity control device. 

We were aware of disadvantageous side effects of the 
red light. Silverman and Goodban (1972) introduced three 
lights, for example, which would seem to encourage the use 
of a continuous monotone by the speaker. To avoid this, 
one light only was used to prevent overload on the system
due to sudden increase in vocal amplitude. Obviously, this
prevented any accurate measurement of the Lombard effect,
but as that effect has been well-documented in the past
(Lane and Tranel, 1971), the decision to obtain as good
recordings as possible was made at the expense of data re-
levant to voice intensity changes.

Oral instructions were given initially about the 
material presented for reading and for propositional 
speech; also, prior to the commencement of each section of 
the material, the instructions were presented visually (Appen- 
dix A). The material was always presented in the same 
order, as described earlier in this chapter. As the passage 
of reading would be used primarily for the data analysis, 
this was presented after the questions in order that the 
subject might have some minutes to adapt to the masking 
noise, and for any sudden tension effects to have subsided. 
On the other hand, we had considered the possibility of 
fatigue effects, and to counteract this the subjects were 
given a break between each condition and were encouraged 
to talk without the masking sound disturbance. 

After the verbal instructions, the earphones were 
placed on the subject's ears in readiness for Condition A. 
The masking noise was presented binaurally at the previously 
calculated intensity, 85 dB above the individual hearing 
threshold for the particular subject. After the correct
setting was established the experimenter signalled the 
subject to begin by turning the page of instructions to the
first section (the questions). Thus, there was a 5-10
second presentation of the masking noise prior to the sub-
ject commencing to speak.

After a brief, one to two minute, rest we continued 
with Condition B. The laryngeal vibrator was attached to 
the table, using the magnetic clamp, and the angle adjusted 
so that the shaped pad came in contact with the subject's 
neck. A resting position against the headrest of the chair 
maintained the subject's head in a stable position and the 
vibrator pad was then positioned carefully against the 
thyroid cartilage region of the neck, thus ensuring a 
relatively constant pressure. For Condition B, the subject 
was requested to read the prose passage only. The masking 
noise was then presented in the same manner as for Condition 
A through the earphones, and also through the laryngeal 
vibrator. Exactly the same procedure was followed for the 
second masking noise. The control condition, III, was 
conducted in the same way, with the subjects seated in the 
same position and reading the same material, but there was 
no masking sound present. 

Following completion of the experimental procedure 
each subject was asked questions concerning her subjective 
reactions to the experiment: (a) was she able to hear her- 
self speak? (b) which masking noise was preferable, and 
why? (c) which was most effective in preventing her from 
hearing her own voice? (d) was she aware of any difficulty 
in speaking at any time during masking? (e) were any of the
masking conditions especially notable in this respect? She
was also asked about the laryngeal vibrator and whether this
had been disturbing, whether it had masked her voice more 
effectively than the simple binaural masking, or made no 
difference. Finally, she was asked if the loudness of the
masking noise had been uncomfortable in any way.


(iv) Data recording and analysis 

As indicated at the end of Chapter Two, the pilot 
experiments had revealed changes in voice quality associated 
with an apparent laryngo-pharyngeal tension under AF mask- 
ing. Since this had not been previously reported, we 
decided to restrict attention to analysis of the frequency 
spectra in order to obtain physical evidence of these 
changes. The primary instrumentation for the physical 
analyses is described below. 

Sound spectrographs were made from the data tapes 
using a Kay, model 6061-B Sona-Graph fed from a Teac,
model A 7030, tape recorder. Both complete displays and 
sections (frequency spectra) were made. The following 
settings were used uniformly throughout, unless stated 
otherwise: linear scale; calibration select, 500 Hz.; 
record range, 80 Hz - 8 k Hz.; wide band; H-S switch 
(providing a high frequency pre-emphasis in the record cir- 
cuits). Sections were taken to give an amplitude versus 
frequency display at a preselected point in time. The 
Sona-Graph settings for this were the same as for the full 
displays, except that the Sectioner was activated and the
band selector was set for narrow band analysis.

High speed ink recordings were made to illustrate
variations in high frequency speech components. Three 
channels of a model EM 34 Mingograf were used in conjunction
with two Frøkjær-Jensen Electronics intensity meters and
a type 400 Audio Frequency Filter to isolate the high fre- 
quency components. The signal was obtained from a Teac 
tape recorder and recorded unfiltered on one channel, and 
after high-pass filtering on the other. The third channel 
traced the fundamental frequency of the subject's voice, 
the data recording passing from the Teac tape recorder to 
a Trans Pitch Meter and into the upper channel of the 
Mingograf recorder. A paper speed of 50 mm/sec was used 
throughout. Studies were made of various representative 
sections of the data by high-pass filter settings ranging 
from 3.9 k Hz. to 8 k Hz. Later recordings were made with 
the filter set at three different settings, 3.9 k Hz. and
two higher cut-off frequencies, for comparison.
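
For readers working with digitized recordings, the comparison of filtered and unfiltered intensity can be approximated numerically. The following Python sketch is not the analogue procedure used here: it assumes a sampled signal and uses a plain discrete Fourier transform to estimate what fraction of the signal energy lies at or above a chosen cut-off frequency. The function name and the test signal are hypothetical.

```python
import cmath
import math


def band_energy_fraction(samples, sample_rate, cutoff_hz):
    """Fraction of (non-DC) signal energy at or above `cutoff_hz`.

    Uses a direct discrete Fourier transform over the positive
    frequency bins; suitable only for short segments.
    """
    n = len(samples)
    total = 0.0
    high = 0.0
    for k in range(1, n // 2 + 1):  # skip the DC bin (k = 0)
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        power = abs(coeff) ** 2
        total += power
        if k * sample_rate / n >= cutoff_hz:
            high += power
    return high / total if total else 0.0


if __name__ == "__main__":
    rate = 16000  # hypothetical sampling rate in Hz
    # Hypothetical test signal: a 1 kHz tone plus a weaker 5 kHz tone.
    signal = [math.sin(2 * math.pi * 1000 * i / rate)
              + 0.5 * math.sin(2 * math.pi * 5000 * i / rate)
              for i in range(160)]
    for cutoff in (3900, 5600):
        print(cutoff, band_energy_fraction(signal, rate, cutoff))
```

Raising the cut-off from 3.9 k Hz. toward 8 k Hz. removes progressively more of the spectrum from the numerator, which is why recordings at several filter settings are useful for locating the added components.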

The laryngeal vibrator created a significant amount 
of noise which was detected by the microphone and appeared 
on the data tapes. While the subject's speech output is 
intelligible, the interfering noise is clearly audible, and 
therefore no instrumental analysis was made of the data
from the B conditions.

FIGURE 1

Sound spectrographs of typical additional formants and vocal
fry.
Subject: N.N.
Upper spectrograph: illustrating the phrase "Ivor
Dent is the mayor of the city"
under speech-noise masking.
Middle spectrograph: phrase, "poor and humble husband"
in the absence of masking.
Lower spectrograph: phrase, "poor and humble man",
illustrating vocal fry preceding
the word "man".

FIGURE 2

Sound spectrographs of the word "trees", part of the phrase
"some trees that overhung a cool stream". A section of the
frequency spectrum of the vowel sound /i/ in "trees" is seen
to the right of each display, and the point at which the
section is taken is indicated by an arrow.
Masking conditions as indicated.

FIGURE 3

Sound spectrographs of the word "trees", part of the phrase
"some trees that overhung a cool stream". A section of the
frequency spectrum of the vowel sound /i/ in "trees" is seen
to the right of each display, and the point at which the
section is taken is indicated by an arrow.
Masking conditions as indicated.

FIGURE 4

Sound spectrographs of the word "trees", part of the phrase
"some trees that overhung a cool stream". A section of the
frequency spectrum of the vowel sound /i/ in "trees" is
seen to the right of each display, and the point at which
the section is taken is indicated by an arrow.

FIGURE 5

Sound spectrographs of the word "trees", part of the phrase
"some trees that overhung a cool stream". A section of
the frequency spectrum of the vowel sound /i/ in "trees" is
seen to the right of each display, and the point at which
the section is taken is indicated by an arrow.

FIGURE 6

Sound spectrographs of the word "trees", part of the
phrase "some trees that overhung a cool stream". A section
of the frequency spectrum of the vowel sound /i/ in "trees"
is seen to the right of each display, and the point at which
the section is taken is indicated by an arrow.

FIGURE 7

Sound spectrographs of the word "trees", part of the phrase
"some trees that overhung a cool stream". A section of
the frequency spectrum of the vowel sound /i/ in "trees" is
seen to the right of each display, and the point at which
the section is taken is indicated by an arrow.

FIGURE 8

Sound spectrographs of the word "trees", part of the phrase
"some trees that overhung a cool stream". A section of the
frequency spectrum of the vowel sound /i/ in "trees" is
seen to the right of each display, and the point at which
the section is taken is indicated by an arrow.

FIGURE 9

Sound spectrographs of the word "trees", part of the phrase
"some trees that overhung a cool stream". A section of the
frequency spectrum of the vowel sound /i/ in "trees" is
seen to the right of each display, and the point at which
the section is taken is indicated by an arrow.
Masking conditions as indicated.

FIGURE 10

Sound spectrographs of the phrase "Otters are usually
nocturnal and elusive" illustrating a prolonged vocal fry
onset to the first vowel in the word "otters". The upper
spectrograph is of the control condition and the lower
spectrograph is of the autogenic noise masked condition.

[Spectrograph panels labelled CONTROL and SPEECH-NOISE MASKED.]


FIGURE 11

Sound spectrographs to illustrate vocal fry episodes occurr-
ing in the autogenic noise masked condition. The upper
spectrograph displays the phrase "Canada, our true and
native land" under the control condition. For comparison,
the centre spectrograph displays the same phrase under the
masked condition, and illustrates the vocal fry on the
word "land". The lower spectrograph displays the phrase
"British Columbia is on..." illustrating a prolonged
vocal fry on the word "on".

Subject: B.M. Upper and centre spectrographs.
Subject: P.K. Lower spectrograph.

FIGURE 12

Oscillographic recordings which illustrate the increased
intensity in the mid to high frequency range. Three separate
recordings are, from top to bottom, control, white-noise
masked, autogenic noise masked, respectively. The upper
tracing for each recording shows the fundamental frequency;
the centre tracing shows intensity above 3900 Hz. after
high pass filtering; the lower tracing shows the total
intensity.

FIGURE 13

Oscillographic recordings which illustrate the increased
intensity in the mid to high frequency range. Three
separate recordings are, from top to bottom, control,
white-noise masked, autogenic noise masked, respectively.
The upper tracing for each recording shows the fundamental
frequency; the centre tracing shows intensity above 5600
Hz. after high pass filtering; the lower tracing shows the
total intensity.

CHAPTER FOUR


Experimental Results


1. Qualitative impressions


Initial assessment of the data tapes by listening re- 
vealed signs of disturbance in the speech output in all of 
the masking conditions as compared with the control. Gen- 
erally, these were similar to those alterations of voice 
characteristics and of speech articulation noted by other 
authors and discussed in Chapter One. Specifically, num- 
erous misarticulations and omissions of consonants were 
noted, together with both increases and decreases in rate 
and in intensity, the changes in rate leaving an impression 
of abnormal rhythmic properties. There was also a notice- 
able flaccidity, or laxness, of articulation particularly 
involving lip movements. This gave a general slurred 
character to the speech and lack of precision in articulation. 

However, the major impression, gained from listening 
to the data tapes and through observing the subjects under 
the masking conditions, was the generalized laryngo-pharyngeal 
tension and strained voice quality referred to earlier. 
Apparently associated with this phenomenon, and occurring 
notably in the autogenic masking condition, were episodes 
of marked vocal fry: a phenomenon characterized by regular 
opening and closing of the opposed vocal cords, in contrast 
to the normal complex vibratory mode. The laryngo-pharyngeal 
tension appeared to persist during the masking conditions.
The only exception was one teenage subject whose voice
production perhaps reflected her indifferent attitude to
the experimental procedure. In this one case the tension
was less marked.


2. Spectrographic data

A twelve-second segment of the standard reading 
passage was chosen at random. Sound spectrographic analysis 
of this segment was made for each subject and condition.

In the masking conditions these showed the presence of 
additional high frequency formants above the expected for- 
mant configurations for the particular vowel. The episodes 
of vocal fry can also be readily observed in the spectro-
graphic examples. Neither of these effects was found in
the control condition. Examples of both may be seen in
Figure 1.

The following description of the spectrograph re-
cordings is inserted at this point as a guide in inter-
preting the results shown in Figures 1 to 11. The 'complete
display' sound spectrograph is illustrated in the spectro- 
graphs in Figure 1. The spectrograph displays intensity and 
frequency as a function of time. The horizontal axis is 
time, and the vertical axis frequency in the range 80-8000 
Hz. with the scale defined by the striated bands on the left 
and right margins. Each segment of the scale calibration 
bands represents 500 Hz. Intensity at a given frequency 
and point in time is represented by the density of the 
tracing. When a frequency spectrum is analyzed (referred 
to as a 'Section'), as in Figures 2-9, it is displayed 
on a blanked portion of the spectrograph. The frequency 
spectrum shows amplitude against frequency at a particular 
point in time. This time point is indicated by a small 
arrow in Figures 2-9. The amplitude is represented by the 
rightward magnitude of the tracing, and the frequency scale 
is the same as for the complete display spectrographs, but 
inverted, the low frequencies appearing at the top of the 
spectrograph and the high frequencies at the bottom. 
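
For readers working from digitized recordings rather than Sona-Graph paper, the 'Section' described above can be approximated numerically. The following Python sketch is illustrative only and is not the procedure used in this study: the function names, the Hann window, the 256-sample frame and the sine tone standing in for speech are all assumptions. It returns amplitude against frequency at one chosen time point.

```python
import cmath
import math


def synth_tone(freq_hz, sample_rate, n_samples):
    """Hypothetical test signal: a pure sine tone standing in for recorded speech."""
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


def section_spectrum(samples, sample_rate, center_time, frame_len=256):
    """Amplitude against frequency for one frame centred on center_time (seconds).

    Assumes len(samples) >= frame_len. Returns (frequencies_hz, magnitudes).
    """
    # Locate the frame and keep it inside the recording.
    start = int(round(center_time * sample_rate)) - frame_len // 2
    start = max(0, min(start, len(samples) - frame_len))
    frame = samples[start:start + frame_len]

    # Hann window to reduce leakage from the frame edges.
    windowed = [x * 0.5 * (1.0 - math.cos(2.0 * math.pi * n / (frame_len - 1)))
                for n, x in enumerate(frame)]

    # Direct discrete Fourier transform, bins 0 .. frame_len / 2.
    freqs, mags = [], []
    for k in range(frame_len // 2 + 1):
        acc = sum(w * cmath.exp(-2j * math.pi * k * n / frame_len)
                  for n, w in enumerate(windowed))
        freqs.append(k * sample_rate / frame_len)
        mags.append(abs(acc))
    return freqs, mags


def peak_frequency(freqs, mags):
    """Frequency of the strongest component in the section."""
    return freqs[max(range(len(mags)), key=mags.__getitem__)]
```

With a real recording, `center_time` would be the instant marked by the small arrow in the figures, and the returned magnitudes correspond to the rightward extent of the section tracing.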

The chosen phrase, spoken by each of the eight subjects 
under the three conditions of control, white noise masking 
and autogenic noise masking, was analyzed. After obtaining 
the sound spectrograph of the whole phrase, a second 
procedure was carried out using the section analyzer of 
the Sona-Graph, yielding the frequency spectrum at a 
particular point in the recorded phrase. This was a section 
of the vowel /i/ in the word /trees/, part of the phrase 
/Some trees that overhung a cool stream/. The vowel /i/ 
contains a higher proportion of high frequency energy than 
other vowels; thus /i/ was felt to constitute a strong test 
for the presence of atypical high frequency components. As 
indicated earlier, the data recordings of the conditions 
involving the use of the laryngeal vibrator, the B 
conditions, contained noise that originated from the 
vibrator output, and were not suitable for such analysis. 
Interesting data were obtained from the use of the laryngeal 
vibrator, however, and will be discussed at a later point. 
The sound spectrographs are displayed in Figures 2-9. Each 
figure presents three spectrographs of the same phrase, plus 
a section of the vowel /i/, produced under each of the three 
experimental conditions. One figure contains the 
spectrographs for one subject alone. There is a consistent 
pattern in all subjects. 
In the control condition the frequency spectrum shows 
characteristic peaks, usually three, corresponding to the 
/i/ formants. There is also a low intensity band in the 
mid-range (2.5 k Hz. - 3.5 k Hz.), decreasing to very low 
intensity in the high frequencies. In the autogenic masking 
condition, the normal /i/ formants are not altered but there 
is evidence of additional activity in the mid to high 
frequency range (approximately 2.5 k Hz. - 5.5 k Hz.). In 
six subjects this extra high frequency output is relatively 
intense and contains discrete frequency peaks. In the 
complete display these appear as high frequency formants 
which are atypical of the vowel /i/. Also in six cases there 
is a weaker set of intensity peaks at very high frequencies 
(6.5 k Hz. - 8.0 k Hz.). In the white noise masking 
condition the amount of extra activity seen in the 
spectrograph is intermediate between the control and 
autogenic noise masking conditions. The appearance of the 
high frequency formants is not accompanied by a change in 
the amplitude of normal spectral components under either of 
the masking conditions. This may be observed in the sections, 
which show that the spectra of the normal formant grouping 
for the /i/ vowel remain unaltered under the masking 
conditions; i.e. the /i/ formants are of more or less the 
same intensity as in the control condition. 
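
The comparison made in the preceding paragraph, more energy between 2.5 k Hz. and 5.5 k Hz. with the normal formant region unchanged, can be stated as a band-level difference in decibels. The sketch below is a hypothetical illustration in Python: the function names, the 0 to 2 k Hz. reference band, and the toy spectra are assumptions chosen for the example and are not measurements from this study.

```python
import math


def band_energy(freqs_hz, magnitudes, low_hz, high_hz):
    """Sum of squared magnitudes for components with low_hz <= f <= high_hz."""
    return sum(m * m for f, m in zip(freqs_hz, magnitudes)
               if low_hz <= f <= high_hz)


def band_change_db(control, masked, low_hz, high_hz):
    """Level change (dB) in one band, masked relative to control.

    control and masked are (freqs_hz, magnitudes) pairs on the same grid.
    Assumes the control band energy is non-zero.
    """
    e_control = band_energy(control[0], control[1], low_hz, high_hz)
    e_masked = band_energy(masked[0], masked[1], low_hz, high_hz)
    return 10.0 * math.log10(e_masked / e_control)


def toy_spectra():
    """Invented spectra on a 500 Hz grid; NOT data from the thesis."""
    freqs = [500.0 * k for k in range(17)]          # 0 .. 8000 Hz
    control = [1.0 if f <= 2000.0 else 0.1 for f in freqs]
    # Masked: same low band, ten times the amplitude from 2.5 to 5.5 kHz.
    masked = [1.0 if (f <= 2000.0 or 2500.0 <= f <= 5500.0) else 0.1
              for f in freqs]
    return (freqs, control), (freqs, masked)
```

Calling `band_change_db(control, masked, 2500.0, 5500.0)` on the toy spectra gives a rise of about 20 dB, while the same call over the assumed 0 to 2 k Hz. reference band gives no change, which is the pattern described for the sections.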


Figures 2-5 very clearly show the effects described 
above and are similar for each of those four subjects. The 
differences between the masking conditions and control 
condition are less dramatic in Figure 6 (Subject [?].N.). 
The pattern, however, is similar to that seen in Figures 2-5, 
with increased activity in the middle frequency range. In 
Figure 7, the control and the white noise masked /i/ spectra 
demonstrate considerable non-specific high frequency activity. 
This appears to be a reflection of this subject's light, 
breathy voice production. The voice output under autogenic 
noise masking does, however, show the organized broad double 
peak of activity, comparable to Figures 2-5. 

Subject E.K. (Figure 8) was mentioned previously as 
apparently somewhat indifferent to the experimental procedure. 
Her speech became progressively more weakly voiced. The 
spectrographs are not really open to interpretation except 
that the white-noise masked condition does weakly illustrate 
the mid-frequency double peak, but they also show the weak 
voicing. The data tape of the last subject, B.M. (Figure 9) 
was found to contain extraneous noise with a wide frequency 
band, (probably ventilating fan noise). This was especially 
marked on the control condition section, and may be seen in 
Figure 9. In spite of this, the speech-noise masked condition 
does show the characteristic pattern seen in Figures 2-7. 

The two masked conditions show a very marked glottal pulse 
banding in the vowel /i/. This is not present at all in the 
control condition. 

Figure 10 illustrates a prolonged vocal fry under 
autogenic noise masking together with the control condition 
for comparison. Figure 11 shows a further episode of vocal 
fry on the word /land/ under the autogenic noise masking 
condition, together with the control condition for the same 
phrase. Both this phrase, and the third example in Figure 
11 of a prolonged vocal fry on the word /on/, are of 
propositional speech, rather than prose reading. 


3. Oscillographic data 


Two oscillographic recordings are included because 
they demonstrate the increased intensity in the mid-frequency 
range. Changes in overall rate between the conditions are 
also evident in the varying length of the segments of 
voicing seen on the uppermost tracing (Figures 12 and 13). 
The changes in mid-frequency intensity are observable in the 
recordings of Subject N.N. but were not uniformly seen in 
all the oscillographic as opposed to the spectrographic 
recordings. It is much easier to separate out the voiceless 
high frequency components (e.g. fricative consonants) from 
the voiced components (e.g. vowels) in the spectrographs. 

Figures 12 and 13 are recordings of Subject N.N. 
Figure 12 shows fundamental frequency on the upper tracing, 
total intensity on the lowest, and intensity above [?] k Hz. 
on the centre tracing. Figure 13 is a similar recording of 
the same phrase with the central tracing displaying 
intensity above 5.6 k Hz. 


4. Introspective reports 


Auditory masking techniques and the effects they 
cause are not well defined by current analytic techniques, 
and the objective results of this preliminary study bear 
only on a particular phenomenon, the tension effect. 
Accordingly, it was considered important to include intro- 
spective data reported by the subjects for the light they 
might shed on future experimental directions. Auditing 
the data tapes leaves certain impressions which have not 
been quantified. 

In the immediate post-experimental interviews, all 
subjects reported total loss of auditory feedback under the 
autogenic masking, and all reported some auditory feedback 
under white noise masking. They all also expressed a strong 
preference for the autogenic noise over the white noise. 

The white noise was actively disliked by all subjects, 
although no one complained that it was distressing in terms 
of causing physical discomfort. Questioned about the 
laryngeal vibrator, three of the teenage group, the four 
adults, and the two pilot study adults stated that the 
sensation was not uncomfortable, and that they did not 
experience any greater difficulty in producing speech as 
a result of external pressure in the laryngeal area. Several 
subjects, one from the teenage group and two of the adults, 
commented that in the white noise condition they were unaware 
that the vibrator was working. With the autogenic noise plus 
vibrator condition all the adults, including the two pilot 
study subjects, and three of the teenage group felt they 
were less able to monitor their own speech output. They 
commented that they felt confused and uncertain whether they 
were articulating correctly although they could still speak. 
These remarks made by the subjects were consonant 
with an evaluation of the data tapes. The use of the 
laryngeal vibrator with the white noise masking signal did 
not appear to change the effect associated with the use 
of the air-conducted white noise signal alone. In contrast, 
laryngeal vibration with the autogenic masking signal 
produced greater effects of increased vocal tension and disrupted 
rhythm of speech than either of the two white noise masking 
conditions or the autogenic noise masking without the 
vibrator. The subjects' impression that the autogenic noise 
signal was more effective in masking the auditory feedback 
of their own speech was also the impression gained by 
listening to the speech output on the data tapes. There were 
marked variations in rate, generally decreased under the 
autogenic noise condition and increased under the white 
noise, compared to the control condition. Vowel sounds 
appeared to be lengthened and speech had a more effortful, 
and yet more slurred, quality under the autogenic masking 
condition. 
It was noticeable that among the three adults and two 
pilot study adults who tended to raise the amplitude of 
their voice under masking, none were aware of this. Three 
stated positively that they maintained a normal vocal loudness 
level. Thus it seems that the subjects' changes in voice 
amplitude were unintentional. None of the teenage group 
raised their voices significantly and it was this 
characteristic that distinguished the two groups most clearly. 

Three teenagers and two adults commented that the 
section containing the picture description task was by far 
the easiest for them to perform under the masking conditions. 
The reason given was that they knew what they wanted to 
say. This was an interesting comment, because, on auditing 
the tapes, there seemed to be an increased tendency to 
hesitate, and an increase in voice tension during these 
sections. On the other hand, there is less rhythmic 
disruption in the propositional speech, and it is possible 
that the constraints upon rhythm inherent in a passage read 
aloud made the latter task more difficult under masking 
conditions. Two subjects specifically referred to their difficulty 
with "breathing at the right times", especially when reading. 


5. Subject variability 


Auditing of the data tapes gave a strong subjective 
impression that each subject showed a consistency of 
type of speech error in spite of the fact that the 
group as a whole was highly variable. For instance, one 
subject might show many articulation errors under the 
masking conditions, whereas another would show very few, 
or even none at all. This inter-subject variability in 
these unmeasured responses, such as speech rate, 
misarticulations, and amplitude, constitutes a major problem in 
experiments of this kind. While reports in the literature 
show highly variable responses, this inter-subject origin 
of the variance in measures of speech rate and the like 
has not been emphasized. 


6. Summary of results 


In summary, the following physical and introspective 
observations have been made during the course of this 
experiment. 

First, we were able to replicate positive effects 
which have been reported in previous studies on AF inhibition. 
Using the masking agents at a moderate intensity level, 
disruptions of, for example, speech rate, duration, speech 
articulation, and amplitude of the kind reported by earlier 
investigators were noted in our data tapes. 


Secondly, by using a different type of masking agent, 
i.e. the autogenic noise which matches the spectral character 
of the AF, total AF inhibition was reported by subjects. 

The increased physical effects upon speech output under 
autogenic noise masking compared to white noise clearly 
indicate that the autogenic masking was more effective than 
white noise as a masking agent. 

Consideration of the exploratory studies involving the 
laryngeal vibrator reveals indirect evidence of sensory 
input from the laryngeal region interacting with air- 
conducted autogenic masking noise to increase disturbance in 
speech output. This evidence was gained from auditing 
the data tapes and from the introspective reports of the 
subjects. 

The major effect reported in this study is the 
presence of additional mid to high frequency formants in 
vowels produced under AF inhibition. There is significant 
indication that this increased and organized 
acoustic energy is present without change in amplitude of 
the normal spectral components. The phenomenon is illustrated 
most clearly in the spectrograph displays of all the eight 
subjects and, to some extent, in the oscillographic recordings. 

Apparently related to the above phenomenon was 
a generalized laryngo-pharyngeal tension under AF inhibition. 
The increased occurrences of vocal fry, under autogenic 
noise masking in particular, provide some physical evidence 
of this tension, and some spectrographic illustrations have 
been included. 

Finally, there were also certain differences in response 
among the subjects, which were mainly individual differences 
manifested in the number of articulatory errors, and alter- 
ations in speech rate, for example. These disruptions were 
largely independent of the age group of the subjects. One 
effect which appeared to be age dependent was the tendency 
for the adults to increase the overall amplitude of their 
voice under AF inhibition. The teenage group did not 
demonstrate this tendency, and teenage subjects tended to 
decrease amplitude under AF inhibition. 


CHAPTER FIVE 


Discussions and Conclusions 


(i) Experimental Validity 


Questions concerning the relevance of an experimental 
procedure with respect to the phenomena ostensibly under 
investigation are particularly important in such cases as 
the present one. It is doubtful whether our masking treat- 
ments have any real parallel in natural phenomena. Ambient 
noise is not likely to provide the degree of masking over air 
and bone conducted AF that we obtain under these experimental 
conditions. However, the primary evidence in nature for the 
contribution of AF to motor control in speech is in speech 
changes following anacusis with sudden onset in the adult 
or the linguistically viable child. Accordingly, it seems 
appropriate to consider the relevance of the experiments 
reported here to those changes in motor performance which 
characteristically occur in nature with the sudden onset of 
anacusis. 

In anacusis, the observed patterns of voice and 
articulatory change clearly result from sensory loss, while our 
experimental conditions involve the introduction of sensory 
noise. The critical point, however, is that in both cases a 
specific sensory input is lost to the speaker: the acoustic 
feedback of the speech signal. It is on the basis of this 
point that the relevance of the experimental treatment to 
presumed effects of sensory inhibition must rest. This is not to 
assume the absence of response artefacts due to the 
experimental treatment; attempts to isolate such effects are discussed 
below. 

A second, and important, difference between speech 
effects associated with anacusis and those found in our 
experiments lies in the matter of temporal onset. In 
anacusis speech change is slow in onset, taking several 
months to develop, while in the masking experiments the on- 
set is immediate. At face value, this might suggest a basis 
for invalidating the masking experiments with respect to 
their relevance for the natural phenomena associated with 
sensory inhibition. Such a view might, however, obscure 
the actual complexity of the natural phenomena through a 
misplaced emphasis on questions of experimental validity. 
Instead, we might ask what kind of mechanism could, after 
sudden deafness, support normal speech for a period of 
several months but be unable to continue this support in- 
definitely, and what is there about the experimental condition 
which might inhibit the activity of such mechanisms. 

More generally, in order to examine the validity of 
our experiments in relation to phenomena of AF inhibition 
and to the fundamental problem raised by the occurrence of 
AF, it will be necessary to evaluate a number of hypotheses 
that might provide alternative explanations for the reported 
effects. Throughout the following discussion, it will be 
important to bear in mind a point made in Chapter Two; this 
is the possibility that an AF loop might be closed at one 
or more of the different neurological levels involved in 
speech production. The occurrence of AF presumably enables 
the speaker to monitor his speech output along many 
dimensions. In this study interest is focused not on 
psycholinguistic monitoring, but on physiological sensory 
feedback in relation to presumed direct mechanisms of motor 
control. 


(ii) Frequency Component Effects 


A notable feature of the experimental results is the 
appearance of shaped energy at the 2.5 k Hz. to 5.5 k Hz. 
range. Under white noise masking the prominence of this 
feature lies between that found under autogenic masking, 
where it is most prominent, and that found under the control 
condition, where the experimental effect is absent. 

These findings are different from earlier reports, 
which describe disturbances in articulation, fundamental 
frequency, speech rate, rhythm, and overall intensity. The 
effects we have seen in the form of additional mid to high 
frequency formants in vowels would seem to be evidence of 
induced changes of the vocal production mechanism. Such 
effects could presumably only originate in differences in 
the behaviour of the laryngeal musculature. The normal 
vowel formants, as seen in the sound spectrographs of the 
control condition, are produced by the configurations of 
the vocal tract acting upon the harmonics of the glottal 
wave form. A possible origin of the extra mid to high fre- 
quency formants seen under masking is a sharpening of the 
saw-tooth wave form, giving rise to higher frequency 
components. The appearance of these 2.5 k Hz. to 5.5 k Hz. 
frequency components is, on the basis of the experiments 
reported here, a constant effect across subjects, though 
its prominence appears to depend on individual subject 
characteristics. The prominence effect associated with 
different masking conditions, being greater under autogenic 
than white noise at 85 dB above threshold, appears to be 
constant across subjects. 
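
The suggestion that a sharper saw-tooth yields stronger high frequency components follows from a general property of Fourier series, which can be checked numerically. The Python sketch below is an illustration under stated assumptions, not a model of the larynx: the glottal wave is idealized as a linear rise followed by a linear fall, `closing_fraction` is a hypothetical parameter for the share of the period taken by the fall, and a 200 Hz fundamental is assumed so that harmonic 13 lies at roughly 2.6 k Hz.

```python
import cmath
import math


def skewed_pulse(n_samples, closing_fraction):
    """One period of an idealized wave: linear rise, then a linear fall
    occupying closing_fraction of the period (smaller = sharper)."""
    rise = 1.0 - closing_fraction
    wave = []
    for n in range(n_samples):
        t = n / n_samples
        if t < rise:
            wave.append(t / rise)
        else:
            wave.append((1.0 - t) / closing_fraction)
    return wave


def harmonic_amplitudes(wave, n_harmonics):
    """Amplitudes of harmonics 1 .. n_harmonics of one sampled period."""
    n = len(wave)
    amps = []
    for k in range(1, n_harmonics + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(wave))
        amps.append(2.0 * abs(acc) / n)
    return amps


def high_harmonic_share(amps, first_high):
    """Fraction of harmonic energy at and above harmonic first_high."""
    total = sum(a * a for a in amps)
    high = sum(a * a for a in amps[first_high - 1:])
    return high / total


# Assumed 200 Hz fundamental: harmonic 13 is about 2.6 kHz.
smooth_share = high_harmonic_share(
    harmonic_amplitudes(skewed_pulse(512, 0.5), 40), 13)
sharp_share = high_harmonic_share(
    harmonic_amplitudes(skewed_pulse(512, 0.05), 40), 13)
```

Under these assumptions `sharp_share` exceeds `smooth_share`: shortening the fall of the wave moves a larger fraction of the harmonic energy into the upper harmonics, which is the direction of change proposed in the text.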

The episodes of vocal fry appear to be related to a 
generalized laryngo-pharyngeal tension, and could possibly 
be explained as intermittent muscle tension under AF 
inhibition in addition to the increased tension we have suggested 
in the laryngeal muscles. 


(iii) Interpretation of the Frequency Component Effects 


It might be argued that the appearance of the added 
frequency components is best understood as a response 
artefact resulting from non-specific stress introduced by the 
masking signals. This does not seem likely, however, in view 
of the following considerations. First, we should note that 
the prominence of the added frequency components is greater 
under autogenic masking than under white noise masking. 
Under a response artefact hypothesis, this prominence effect 
should mean that subjects were more stressed by autogenic 
than by white noise. However, introspective reports by 
subjects directly contradict this: subjects unanimously re- 
ported that white noise was more stressful than autogenic 
noise. Thus a response artefact hypothesis would introduce 
a paradoxical relationship between the prominence effect and 
the introspectively reported stress effect. 


A response artefact hypothesis might, however, be 
proposed to explain the immediate onset of speech disturbance 
under AF masking, as opposed to the slow onset observed 
following anacusis. If this is correct, then we should 
expect incremental improvement in speech production under 
prolonged treatment as subjects accommodate to the experimental 
stimulus. As noted in Chapter Three, we have not tested 
an accommodation hypothesis, but the response artefact 
hypothesis is nevertheless not entirely satisfactory for the 
reasons already indicated. 

The differential effects of white noise present puzzles 
in relation to the prominence effect for the added frequency 
components. The paradox itself might be reasonably explained 
on the basis of laterality of brain function; Kimura (1973) 
points out that speech signals and noise are processed in 
opposite hemispheres. We might suppose that if white noise, 
because of its lack of organization, is mainly processed in 
the non-dominant hemisphere, while autogenic noise is pro- 
cessed in the dominant, then autogenic noise should disturb 
speech functions more severely than white noise. Under a 
lateralization hypothesis, we might assume that the presence 
of white noise in the non-dominant hemisphere introduces a 
psychologically disturbing condition because it involves 
interhemispheric stimulus competition. If we assume 
lateralization to be operative, along the lines indicated 
above, and also take into consideration the reported 
introspective accounts on the relative weakness of white noise 
as an AF masking agent, then we might account for both the 
treatment response paradox and the added frequency 
prominence effect, subjective disturbance under white noise 
being due to relatively weak AF inhibition. 


(iv) Differential Control Loops 


If we compare the added frequency effect with induced 
speech disturbances of the more familiar kind associated 
with AF inhibition studies, there appear to be two distinctly 
different types of effect when AF is inhibited. One is the 
previously reported intermittent disruption of articulatory 
segments, and the other is the alteration in voice character- 
istics. The latter would presumably arise from changes in 
activity of the laryngeal muscles, as already noted. The 
fact that interference with AF causes alterations in speech 
output implies that such feedback is an element in the con- 
trol of speech production. But the two kinds of disturbance 
in speech output suggest that two different feedback path- 
ways may be implicated under AF masking. 

Interference with highly organized levels of speech 
required for segment articulation yields intermittent effects 
of the misarticulation type noted by all investigators. This 
suggests either an unregulated open loop, IIa, or else a 
closed loop, IIb, with a relatively low functional load, 
operating at a relatively high brain level. By "unregulated 
open loop" we mean to suggest a central motor program with 
some vulnerability to disturbance in the presence of extraneous 
signals. 

The highly stable voicing effects observed in our 
experiments, perhaps associated with previously reported 
changes in fundamental frequency, suggest a different loop, 
I. The characteristic stability of presumed loop I effects 
observed in our experiments suggests that this is a closed 
loop. Such a loop would presumably have its effect at 
lower brain centres than would loop II, and be involved in 
the control of voice production. 

Our results are thus consonant with the familiar 
distinction between speech and voice. However, the distinc- 
tion between articulatory segment and voice character effects 
raises difficulties when we attempt to assess the role of 
AF in speech control. These difficulties are addressed in 
the following sections. 


(v) Laryngeal Proprioception 


The joints and intrinsic muscles of the larynx are 
known to be supplied with proprioceptive end organs (Wyke, 
1969). We have observed indications of sustained tension in 
the laryngeal region under AF inhibition, and the added 
frequency components seen in the spectrographic data are 
presumed to be direct results of increased muscular tension 
in the larynx. These effects might be interpreted as 
evidence of an extra loading on the laryngeal proprioceptive 
system which might result in an afferent compensation for 
lost AF. 

A compensatory mechanism with this function would, 
under a closed loop hypothesis, explain why speech following 
anacusis retains efficient motor control. Indeed, speech 
therapists give particular emphasis to kinesthesia in speech 
production in the deaf. Unfortunately, we do not have 
spectrographic data on post-anacusis speech; it would be 
interesting to see whether or not the added frequency com- 
ponents observed under AF masking appear in this form of 
speech. The point is worth considering since the possibility 
is that the added frequency components seen in our experi- 
ments may evince an attempt to shift information-bearing 
components of the spectrum above the otherwise effective 
autogenic masking noise. 

The interpretation that we have placed on our data 
concerning laryngeal tension, including the appearance of 
new frequency components, would require us to view loop I 
as capable of being closed by either acoustic or 
proprioceptive afferents. 


(vi) Integrated Control Mechanisms 


The possibility of alternative closing for loop I 
suggests that the significance of AF might not be readily 
appreciated if we view it in isolation. The suggestion 
that proprioceptive activity may increase in partial compen- 
sation for lost AF would tend to emphasize the significance 
of loop closure rather than the sensory modality in the 
loop. After laryngectomy AF is normally available, while 
post-anacusis speech has recourse to kinesthesia from 
proprioception. Thus a direct testing for a closed loop of the 
kind we have postulated, and an assessment of its functional 
significance, would require deprivation of laryngeal 
proprioception in combination with AF masking. In this respect, 
it would be interesting to observe laryngectomees under AF 
masking. 

In our experiments binaural AF masking with autogenic 
noise combined with laryngeal vibration produced the most 
severe disruptions of speech. With the exception of one 
teenage subject whose general unresponsiveness was noted 
in Chapter Four, all subjects under this experimental condi- 
tion, including the two adults who served in the pilot 
studies, appear to have experienced a sense of diminished 
motor control and a marked tendency to thought blockage. It 
would be interesting to determine by further experiments 
whether prolonged experimental treatment would tend to show 
further decrement or an improvement due to accommodation. 
Meanwhile, the evidence that we have from the present experi- 
ments has some value. If we accept the hypothesis that in- 
creased proprioceptive feedback acts in response to AF loss, 
then our experimental condition II(b) would create a severe 
afferent inhibition. The fact that this experimental 
condition produced the greatest disturbance in speech production 
is, at any rate, in conformity with the alternative closure 
hypothesis. Under this hypothesis, the fact that speech shows 
deterioration after anacusis and under AF masking would 
suggest that proprioception acts less efficiently than AF. 


(vii) Immediate Onset Effects 


The above discussion of possible compensatory response 
to AF loss does not account for an important experimental 
effect: the immediate onset of speech disturbance under AF 
masking as opposed to its slow onset following anacusis. 

In our experiments, including the pilot studies, most 
subjects volunteered an introspective report of a kind which 
seemed insufficiently precise to warrant inclusion in 
Chapter Four. The tenor of these reports suggests the 
possibility that a speaker has recourse to an awareness of 
his speech other than from direct AF or proprioceptive feed- 
back, and that the immediate basis for this awareness is 
an endogenic acoustic image or acusma for the sound of his 
voice. Reports along these lines were most commonly made 
in relation to experimental condition II(b), though they 
were also made under condition II(a): binaural autogenic 
masking. 

If we assume that in addition to increased proprio- 
ceptive feedback, the post-anacusis speaker has recourse 
to such acusmata, we might be able to explain the preserva- 
tion of normal speech for several months after the onset of 
anacusis. This would imply that feedback may be provided 
by the acusmata independently of the exogenic AF. The later 
collapse of fine control in speech would have to be explained 
by assuming the maintenance of acusmata is dependent on 
reinforcement by the exogenic wave-form. 

In section (iii) of the present chapter, we argued 
that a response artefact hypothesis would not account for 
speech disturbances observed under AF masking. This argument
referred to the possibility that the observed speech disturbances
are not due to sensory inhibition in a closed loop
system but to a non-specific disturbance caused by a strong
sensory input. We wish to let this argument stand, but we
also wish to introduce a closely related hypothesis. If the
acusmata that we have entertained thus far occur at all,
then they must arise from neuronal activity in the cerebral
auditory nuclei. If, under our AF masking conditions, the
masking noise triggers neural responses which drown the 
activity of nuclei responsible for the acusmata, then the 
latter would not be available as a source for endogenic AF. 
This would explain the immediate onset of speech disturbance
under our AF masking.


(viii) Future experiments 


It is unfortunate that the limitations of the experi- 
mental equipment prevented instrumental analysis of the 
output data from the vibrator experiments. However, the 
effect is clearly important, if not well defined, in the 
present experiments. Future experiments might be devised
using better techniques and equipment so as to prevent the
sound radiating from the vibrator from appearing on the data
tapes.

One implication of the results reported is that 
future experiments in this field will probably involve the 
study of consistent underlying effects rather than the more 
obvious but variable effects studied in the past. This
implies detailed studies of the speech output of relatively
small numbers of subjects. The alternative approach involves
phenomenological study which concentrates on analyzing highly
variable disruptions of speech in large numbers of subjects.
On the basis of previous results, this approach will not
yield information leading to an understanding of the basic
mechanisms. Ringel, one of the leading workers in this
field, has published a study which appeared during the
preparation of this thesis (Horii et al., 1973) and which
indicates a similar approach.


(ix) Summary 


Autogenic noise at 85 dB above threshold appears to pro- 
vide an effective masking for air and bone conducted AF. 
Added frequency components observed under AF masking
appear to be best explained in association with an increased
muscular tension in the larynx. This tension is
interpreted as evidence of increased laryngeal proprio- 
ceptive traffic in compensation for AF loss. 

The added frequency components, when considered together 
with changes in articulation, suggest two distinct types 
of change under AF masking. One change affects highly organized
aspects of speech associated with a relatively high
level of brain activity; the other change affects voice
quality, which would appear to be mediated by lower
brain centres.


The intermittent effect of AF masking on articulation
segments suggests that articulation control is provided
by a somewhat unstable open loop or a closed loop of
relatively low functional load. The stable effect of AF
masking on voice quality suggests a closed loop control.
AF and proprioceptive feedback appear to be represented
in the voice control loop, with proprioceptive feedback increasing
in partial compensation for AF loss, though AF is more
important for laryngeal motor control than is proprioceptive
feedback.

Two distinct stages of speech after sudden anacusis 
appear to be explicable in relation to effects observed 
in the AF masking experiments. An hypothesized endogenic 
AF dependent on reinforcement from exogenic AF may 
support normal speech immediately after sudden anacusis, 
but amnesia for this feedback source will appear beyond 
the reinforcement period, thus giving rise to speech 
disturbances comparable to those seen in AF masking. In 
the experimental situation, the neural source of endo- 
genic AF is presumed to be drowned by neural response to 
the exogenic masking signal, thus providing a basis for 
immediate onset of speech disturbance. 

The superiority of autogenic masking over white noise 
masking is most readily explained by the fact that 
autogenic noise is spectrally shaped to fit the AF. 
Possibilities of lateralization effects for the two 
masking agents are considered to account for the 
differential effect of white noise for subject discom- 
fort and AF masking. 

Evidence of AF contribution to motor control in speech 
suggests that more than one loop may be implicated. 
Until more detailed investigations have been made, uncertainty
about the significance and mode of integration of such loops
prevents a clear assessment of the significance of AF.
Recommendations are made for further studies on various
aspects brought to light in the present research. One
important problem for further study concerns possible
sources for subject x treatment interactions.
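
The summary point on the superiority of autogenic masking rests on the idea that a masking noise can be spectrally shaped to fit the signal it is meant to mask. The following is a minimal sketch of one way such a noise could be produced digitally: the magnitude spectrum of a short frame is kept and its phases are randomized. It is an illustration only and is not the procedure used to derive the autogenic noise in these experiments; the function names, the frame length and the synthetic two-component test frame are all assumptions introduced for the example.

```python
import cmath
import math
import random


def dft(x):
    """Naive discrete Fourier transform (O(N^2)); adequate for short frames."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft_real(spec):
    """Inverse DFT, returning the real part of each output sample."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def spectrally_matched_noise(frame, seed=0):
    """Return noise whose magnitude spectrum equals that of `frame`.

    Hypothetical helper: magnitudes are kept, phases are drawn at random,
    and conjugate symmetry is enforced so that the result is real-valued.
    """
    n = len(frame)
    if n % 2:
        raise ValueError("frame length must be even")
    rng = random.Random(seed)
    mags = [abs(c) for c in dft(frame)]
    spec = [0j] * n
    spec[0] = complex(mags[0], 0.0)            # DC term stays real
    spec[n // 2] = complex(mags[n // 2], 0.0)  # Nyquist term stays real
    for k in range(1, n // 2):
        phase = rng.uniform(0.0, 2.0 * math.pi)
        spec[k] = mags[k] * cmath.exp(1j * phase)
        spec[n - k] = spec[k].conjugate()
    return idft_real(spec)


# Assumed stand-in for a voiced speech frame: two sinusoidal components.
n = 128
frame = [math.sin(2 * math.pi * 5 * t / n) + 0.5 * math.sin(2 * math.pi * 12 * t / n)
         for t in range(n)]
noise = spectrally_matched_noise(frame, seed=1)
```

A white noise, by contrast, would spread its energy evenly over all frequency bins, so much of it would fall where the speech frame has little or no energy.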


APPENDIX A 


The following material was presented to the subjects in 
the same order for each condition. For conditions in which 
the laryngeal vibrator was used, only the passage of prose 
was presented. Each section was preceded by the instruction
written on a separate page.


Page 1. Please follow the instructions which will be
presented here. Try to speak in your usual way
without shouting.

1. Please answer the following questions aloud,
making a sentence for each.


Page 2. What is your name? 
Who is the mayor of the City of Edmonton? 


From what vegetable would you make a Halloween 
Jack O'Lantern? 


What are the names of three of Canada's largest 
cities? 


What is the name of the river on which Edmonton
is situated?


Can you say the first line of the National Anthem? 


What province is west of Alberta, and on what
ocean is its coast?


Page 3. Please read the following passage aloud in your
usual manner.


Page 4. Otters are usually nocturnal and elusive. Some
years have passed since I found a place near tidal
waters where, at dusk at this time of year, I
was able to watch an otter family at play on
successive occasions. The present summer
weather has provided me with a daytime sighting
of an otter. I had reached the shade of some
trees that overhung a cool stream and it was
while enjoying a state of heat-induced inactivity
there that I became aware of the movement of an
animal with a large tail, on the opposite bank.
At this point the stream consisted of shallows
separated by deeper pools. The trees grew from
a hedge which formed the opposite bank of the
stream; one of these trees had fallen. The movement
that I had seen was beside the upturned roots
of this tree. Soon the otter came through an
opening and scurried along the waterside. Then
with quick movements it returned to the hole
among the roots, only to re-emerge later with a
trout in its jaws.


Page 5. Please look at the following pictures. Describe
what you see - talk about them in any way you wish.


Pages 6-9. Pictures mounted on paper, which contained enough
activity or scenery to make description as
straightforward as possible.